Model Selection

# Wav2Vec2 Architecture

Indicwav2vec Odia

A Hindi automatic speech recognition (ASR) model based on the Wav2Vec2 architecture, developed by AI4Bharat.

Speech Recognition

Transformers Other

Audio Classification Model

An audio classification model fine-tuned from facebook/wav2vec2-base-960h; its intended use and training data are not specified.

Audio Classification

This is a fine-tuned model for spoken language identification (LID) across 512 languages, based on the Wav2Vec2 architecture, capable of identifying the language spoken in the input audio.

Speech Recognition

Transformers Supports Multiple Languages

This is a speech language identification model based on the Wav2Vec2 architecture, capable of recognizing 256 languages, and is part of Facebook's Massively Multilingual Speech (MMS) project.

Audio Classification

Transformers Supports Multiple Languages

A language identification model fine-tuned from Facebook's Massively Multilingual Speech (MMS) project, supporting audio classification for 126 languages.

Audio Classification

Transformers Supports Multiple Languages
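The language identification entries above are audio classification models: such a model typically produces one score per supported language, and the predicted language is the one with the highest score. The following is a minimal sketch of that final step only, in plain Python. The label list and the score values are hypothetical placeholders, not output from any of the listed models.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_language(logits, labels):
    """Return the label with the highest probability and that probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Hypothetical label set and scores, for illustration only.
labels = ["eng", "hin", "ory", "rus"]
logits = [1.2, 3.4, 0.3, -0.5]

label, prob = top_language(logits, labels)
print(label, round(prob, 3))
```

A real model would supply the scores and its own label mapping; only the arg-max step is shown here.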

Chinese Hubert Base

A Chinese speech model pretrained on the 10,000-hour WenetSpeech L subset, suitable for speech-related tasks.

Speech Recognition

TencentGameMate

A fine-tuned speech recognition model based on facebook/wav2vec2-base, supporting automatic speech-to-text tasks.

Speech Recognition

A speech recognition model fine-tuned from facebook/wav2vec2-base, with a word error rate (WER) of 1.0 (that is, 100%) on the evaluation set.

Speech Recognition
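Many entries in this list report a word error rate (WER). As a rough guide to reading those numbers, WER is commonly computed as the word-level edit distance (substitutions, deletions, and insertions) between the model's transcript and the reference, divided by the number of reference words. The sketch below is a plain-Python illustration of that definition; it is not the evaluation code used for any listed model, and the example sentences are made up.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            )
        prev = curr
    return prev[-1] / len(ref)


# A perfect transcript gives 0.0; one with every word wrong gives 1.0.
print(word_error_rate("the cat sat on the mat", "the cat sat on the mat"))  # 0.0
print(word_error_rate("the cat sat on the mat", "a dog ran in my yard"))    # 1.0
```

Under this definition a WER of 1.0 means the number of word errors equals the number of reference words, so lower values are better.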

Wav2vec2 Xlsr 53 Russian Emotion Recognition

This is a Russian speech emotion recognition model based on the XLS-R Wav2Vec2 architecture, capable of identifying 7 basic emotions with an accuracy of 72%.

Audio Classification

Transformers Other

A fine-tuned speech recognition model based on facebook/wav2vec2-base, with a word error rate (WER) of 1.0.

Speech Recognition

This model is a speech recognition model fine-tuned from facebook/wav2vec2-base-960h, with a word error rate (WER) of 1.0 on the evaluation set.

Speech Recognition

English Filipino Wav2vec2 L Xls R Test 07

This model is a fine-tuned version of jonatasgrosman/wav2vec2-large-xlsr-53-english on Filipino speech datasets, primarily used for English-to-Filipino speech recognition tasks.

Speech Recognition

Wav2vec2 Base Timit Demo Colab3

A speech recognition model based on facebook/wav2vec2-base, fine-tuned on the TIMIT dataset.

Speech Recognition

Wav2vec2 Base Timit Demo Colab1

This model is a fine-tuned speech recognition model based on facebook/wav2vec2-base, trained and evaluated on the TIMIT dataset.

Speech Recognition

cuzeverynameistaken

Wav2vec2 Base Timit Demo Colab60

This model is a fine-tuned speech recognition model based on facebook/wav2vec2-base, trained for 60 epochs on the TIMIT dataset with a word error rate (WER) of 1.0.

Speech Recognition

Wav2vec2 Base Timit Demo Colab7

This model is a fine-tuned speech recognition model based on facebook/wav2vec2-base, trained on the TIMIT dataset with a Word Error Rate (WER) of 0.5426.

Speech Recognition

Wav2vec2 Base Timit Demo Colab3

This model is a fine-tuned speech recognition model based on facebook/wav2vec2-base, trained on the TIMIT dataset with a word error rate (WER) of 0.5608 on the evaluation set.

Speech Recognition

Wav2vec2 Base Timit Demo Colab2

This model is a speech recognition model fine-tuned from facebook/wav2vec2-base, achieving a word error rate (WER) of 0.5664 on the evaluation set.

Speech Recognition

Wav2vec2 Base Timit Demo Colab6

This model is a fine-tuned speech recognition model based on facebook/wav2vec2-base, trained on the TIMIT dataset with a word error rate (WER) of 0.5282.

Speech Recognition

Wav2vec2 Base Timit Moaiz Explast

This model is a speech recognition model based on facebook/wav2vec2-base, fine-tuned on the TIMIT dataset and primarily used for English speech-to-text tasks.

Speech Recognition

Wav2vec2 Base Timit Demo Colab1

This model is a fine-tuned speech recognition model based on facebook/wav2vec2-base, trained on the TIMIT dataset with a Word Error Rate (WER) of 1.0.

Speech Recognition

Ctrlv Wav2vec2 Tokenizer

A speech recognition model fine-tuned from facebook/wav2vec2-base, with a word error rate (WER) of 31.38% on the evaluation set.

Speech Recognition
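The entries in this list report WER in two forms: as a bare number (for example 0.5426) and as a percentage (for example 31.38%). To compare them, it helps to put both on the same scale. The helper below is a small hypothetical sketch, and it assumes that a bare number is a fraction rather than a percentage; the listing itself does not state this, so treat the assumption with care.

```python
def wer_as_fraction(reported: str) -> float:
    """Convert a reported WER such as '0.5426' or '31.38%' to a fraction.

    Assumption: values without a '%' sign are already fractions.
    """
    text = reported.strip()
    if text.endswith("%"):
        return float(text[:-1]) / 100.0
    return float(text)


# Values copied from entries in this list, in the form they are reported.
reported = {
    "Wav2vec2 Base Timit Demo Colab60": "1.0",
    "Wav2vec2 Base Timit Demo Colab7": "0.5426",
    "Ctrlv Wav2vec2 Tokenizer": "31.38%",
}

# Sort from lowest (best) to highest (worst) WER on a common scale.
ranked = sorted(reported, key=lambda name: wer_as_fraction(reported[name]))
for name in ranked:
    print(f"{name}: {wer_as_fraction(reported[name]):.4f}")
```

These figures come from different evaluation sets, so the ordering is only a rough comparison.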

Wav2vec2 Base Toy Train Data Slow 10pct

A speech recognition model fine-tuned from facebook/wav2vec2-base on an unspecified dataset, with a word error rate (WER) of 0.7175.

Speech Recognition

Wav2vec Tr Lite AG

This is a Turkish automatic speech recognition model based on the XLSR Wav2Vec2 architecture, trained on the Common Voice Turkish dataset.

Speech Recognition Other

Wav2vec2 Timit Demo

A speech recognition model based on facebook/wav2vec2-base, fine-tuned on the TIMIT dataset.

Speech Recognition

Wav2vec2 From Scratch Finetune Dummy

This is an Indonesian automatic speech recognition model based on the XLSR Wav2Vec2 architecture, developed by cahya and fine-tuned on the Common Voice Indonesian dataset.

Speech Recognition

Transformers Other

Viwav2vec2 Base 100h

A base Wav2Vec2 model pretrained on 100 hours of unlabeled Vietnamese speech audio from the VLSP dataset, requiring fine-tuning for downstream tasks.

Speech Recognition

Transformers Other

Hindi Wav2vec2 Stt

A Hindi speech recognition model based on the Wav2Vec2 architecture that directly transcribes audio into text.

Speech Recognition

Wav2vec2 Xlsr Greek Speech Emotion Recognition

A Greek speech emotion recognition model based on the Wav2Vec 2.0 architecture, capable of identifying five emotions: anger, disgust, fear, happiness, and sadness.

Audio Classification Other

Wav2vec2 Xls R 300m English

An English automatic speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on the librispeech_asr dataset, achieving a word error rate of 12.29% on the LibriSpeech test set.

Speech Recognition

Transformers English

Wav2vec2 Base Timit Demo Colab 1

This model is a fine-tuned speech recognition model based on facebook/wav2vec2-base, trained on the TIMIT dataset with a word error rate (WER) of 0.3874 on the evaluation set.

Speech Recognition

Wav2vec2 Base Timit Demo Colab

A speech recognition model based on facebook/wav2vec2-base, fine-tuned on the TIMIT dataset.

Speech Recognition

This model is a fine-tuned speech recognition model based on facebook/wav2vec2-base-960h, achieving a word error rate of 21.61% on the evaluation set.

Speech Recognition

Timit 5percent Supervised

A speech recognition model based on facebook/wav2vec2-large-lv60, fine-tuned on the TIMIT dataset using 5% of the data for supervised training.

Speech Recognition

Wav2vec2 Base Timit Demo Colab

A speech recognition model based on facebook/wav2vec2-base, fine-tuned on the TIMIT dataset and specializing in English speech-to-text tasks.

Speech Recognition

An Urdu automatic speech recognition (ASR) model fine-tuned from Facebook's wav2vec2-xls-r-1b model and trained on the Common Voice 8.0 Urdu dataset.

Speech Recognition

Transformers Other

HarrisDePerceptron

© 2025 AIbase